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Abstract 

This paper approximates simulation models by B-splines with a penalty on high- 
order finite differences of the coefficients of adjacent B-splines. The penalty 
prevents overfitting. The simulation output is assumed to be nonnegative. The 
nonnegative spline simulation metamodel is casted as a second-order cone pro¬ 
gramming model, which can be solved efficiently by modern optimization tech¬ 
niques. The method is implemented in MATLAB/GNU Octave. 

1 Introduction 

People use computer simulation to study complex systems that prohibit analyt¬ 
ical evaluations, in order to have a basic understanding of the system, or to find 
robust decisions or policies, or to compare different decisions or policies [ 201 . 
Simulation is applied in various areas ii m mi and it is considered as one of 
the three most important operations research techniques 122) . Let y represent 
the response and x represent the input of a system. A simulation model can 
then be written as 

y = fix). 

In situations where the systems are so complex that even their valid simulation 
models can’t be evaluated in reasonable time, metamodels, or models of mod¬ 
els [H], are constructed to approximate the simulation models. Advantages of 
the simulation metamodel include “model simplification, enhanced exploration 
and interpretation of the model, generalization to other models of the same 
type, sensitivity analysis, optimization, answering inverse questions, and pro¬ 
viding the researcher with a better understanding of the behaviour of the system 
under study and the interrelationships among the variables” m- 
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Parametric polynomial response surface approximation is the most popular 
technique for building metamodels [5]. To determine and quantify the unknown 
or too complex relationship between the response variables and the experimental 
factors assumed to influence the response, in response surface methodolgy - 
introduced by Box and Wilson [8], a mathematical model is constructed to fit 
the data collected from a series of experiments, and the optimal settings of the 
experimental factors is determined mna ng. Usually the mathematical model 
is a first or second order polynomial, called a response surface. 

By Weierstrass approximation theorem, every continuous function can be 
uniformly approximated as closely as desired by a polynomial. Polynomials 
are easy to compute and have continuous derivatives of all orders. On the other 
hand, polynomials are inflexible: their values on a complex plane are determined 
by an arbitrarily small set [33 Theorem 3.23]; they oscillate increasingly with 
the increase in the order of the polynomials, while high-order is required for 
suitable accuracy in polynomial approximation; the Runge phenomenon m is 
the classic example of divergent polynomial interpolation. A polynomial fits 
data nicely near one data point may display repulsive features at parts of the 
curve not close to that particular data point. 

Approximation by splines (smooth piecewise polynomials) overcomes the in¬ 
flexibility of polynomial approximation. In practice, B-Splines, which are first 
thoroughly studied by [33], are widely used in approximation, as there are good 
properties associated with B-splines m- Especiallarly, compared with repre¬ 
sentations by splines in truncated power basis—defined as {(x — U) 3 + /j\\ (j = 
1,... fc)| for node t l: B-spline representations are relatively well-conditioned and 
involve fewer basis functions computationally. Let t = (t,;) be a nondecreasing 
sequence. The ith (normalized) B-spline basis function of order 1 for the knot 
sequence t is defined as follows: 


Bnt(x) = Xi(x) = 


tj ^ x ^ U+i 
otherwise 


When it can be inferred from the context, the knots t and variable x are omitted 
in the notations for B-spline representations. Denote 


Uik( x) 


k-l — ti ’ 


U 7^ ti+k— 1 
otherwise 


For k > 1, the ith B-Spline basis function of order k for knot sequence t can be 
obtained recursively by: 

Bik — i T (1 1- (1) 

The spline basis function Bm depends only on the knots t,.... ,ti+k- It is 
positive on the interval (U,U+fc) and is zero elsewhere. The Curry-Schoenberg 
Theorem [8] describes how to construct B-spline basis for the linear space of 
piecewise polynomial functions satisfying predefined continuity conditions based 
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on the muliplicity of the knots. This property simplifies the approximation of 
functions with required degree of smoothness, compared with truncated power 
spline approximation, where additional constraints on smoothness need to be 
included in the model. The de Boor algorithm fjj is a well-conditioned yet 
efficient way of evaluating B-splines. 


P-Spline approximation. To fit the metamodel with data collected from 
experiments, i.e., to find the parameters for the B-spline approximation, we 
conside P-spline regression [14] , The objective function of a P-spline regression 
combines B-splines with a penalty on high-order finite differences of the coeffi¬ 
cients of adjacent B-splines. Similar to the smoothing term in the loss function 
for smoothing spline regression [3U1 SU > the penalty in P-spline regression loss 
function prevents overfitting, i.e., the penalty reduces the variation of the fit¬ 
ted curve caused by data error. Compared with smoothing splines, P-splines 
are relatively inexpensive to compute and without the complexity of choosing 
the optimal number and positions of knots — too few data points causes un¬ 
der fitting while too many data points results in overfitting. An algorithm of 
determining the number of knots for P-spline regression is given in [22]. 

Let (cti) represent the B-spline coefficients. The second-order differences of 
the adjacent B-spline coefficients for the knot sequence t are 


i a j a j- 1) ( a j —i Uj- 2 ) — a.j 2aj-i + aj-2- 


Denote the parameter controlling the smoothness of the fit by A. The least 
squares objective function (loss function) of the regression of m data points 
(a yi) using n B-spline basis functions of order four with a penalty on second- 
order differences of the B-spline coefficients, i.e. the P-spline regression loss 
function for fitting the metamodel studied in this paper, is 


min 

O' 


E 


Vi - OijB j4 (xi) 
3 =1 


+ ^E - 2a i-i + a 3- 2 ) 


3=3 


( 2 ) 


Nonnegative model fitting. In many applications, the response of the sys¬ 
tem is known or required to be nonnegative or above some threshold; for in¬ 
stance, when the output of the system describes duration, productions, prices, 
demand, sales, wages, amount of precipitation, probability mass, etc; see El ESI)- 
Because of the noisy or tendency in the data, quite often, the fitted curve doesn’t 
exhibit nonnegativitiy, even though it should be. For instance, let ( Xi,yi ) be 
the monthly precipitation amount at some region, where Xj is the variable for 
months and yi is the rain fall amount. The rain fall amount may be decreas¬ 
ing during some period till at some months there is little or no rain, then it 
may increase again. Because of the increase and decrease trend before and af¬ 
ter these certain months, the fitted curve may have negative values at these 
points for y [25], Even if the data points are nonnegative, without imposing 
the nonnegativity constraints, the resulting models may take negative values at 
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some areas [35]. In [3] a numerical example of cubic spline approximation of 
arrival-rate for an e-mail data set shows that the maximum likelihood spline 
takes negative values in a significant time period with all positive data points, 
and the estimation problem may even be unbounded and thus ill-posed. 

To obtain a satisfiable and sometimes meaningful model, the nonnegativity 
constraint on the output needs to be imposed on the regression. The nonneg¬ 
ative cubic spline approximation in truncated power basis is considered in [3]. 
Constrained smoothing spline approximation is studied in m , but they ac¬ 
knowledge the computational difficulty in their approach. Since the B-spline 
basis functions are nonnegative, imposing positivity on B-spline coefficients m 
or integrating B-splines with positive coefficients (I-spline [2Sj) preserves posi¬ 
tivity in regression. But this approach excludes some classes of positive splines 
and thus reduces the accuracy of the regression. It is proved in m that errors 
in approximation of nonnegative functions by B-splines of order k (degree < k) 
with nonnegative coefficients are bigger in magnitude compared with errors in 
approximation by nonnegative splines of the same order, and the difference be¬ 
tween the magnitudes of the errors increases with the order of the splines. The 
two approximation schemes give errors of the same magnitude only if k < 2, i.e. 
approximation by piece wise constant or piece wise linear functions. Because of 
the approximation and computational advantage of P-splines, this paper focuses 
on nonnegative P-spline approximation. 

To simplify notation, in this paper, we concatenate vectors row wise by 
and concatenate vectors column wise by for instance, the adjoining of vectors 
x, y , and z can be represented as 



2 Nonnegative Cubic Polynomials 

In the context below, matrices are represented by capital letters: A = [a^-], 
where the element of matrix A at both of its ith row and ;;th column is denoted 
as dij . Let A 0 represent the symmetric matrix A being positive semidefinite. 
For two matrices A and B of the same size, let A* B denote their Hadamard 
product: 



By Markov-Lukacs theorem [2B], a cubic polynomial p(x) = fox 3 + fox 2 + 
fox + fo is nonnegative on the interval [ti,ti+ 1 ] if and only if there exist 
ci, C 2 , di, d 2 £ R such that p(x) can be represented as 


p(x) = (x - U)(c\x + c 2 ) 2 + (tj+i - x)(dix + d 2 ) 2 . 


Denote 
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Then by JS Theorem 1], the above representation is equivalent to: existing 
C = [ Cij ] y 0, D = [dtj] y 0 such that 

( c\X + C 2) 2 = C • if, (d±x + CL 2) 2 = D • H. 

Because C y 0 is equivalent to: c n > 0, C 22 > 0, cf 2 < C 11 C 22 , p(x) is non¬ 
negative on [ti,t i+ 1 ] if and only if there exist Cn, C 12 , C 22 , dn, di 2 , <^22 £ K such 
that 

P3 = C 11 — dn, 

P 2 = —i»cn + 2ci2 + ti+idu — 2di2, 

Pi = — 2 ti c 12 + C22 + 2tj_|_ 1 d 12 — C?22, 

A) = —tiC22 +ti+\d22 

C 11 , C 22 , dn, d22 > 0 
C?2 — c ll c 22, d 2 2 < dnd22- 

3 Nonnegative Representations By B-Splines Of 
Order Four 

Based on the definition of B-spline basis functions the ith B-spline basis 
function of order three for knot sequence t is 

Bi3 — 0Ji3UJi2Xi + [Ui3 (1 — £*4+1,2) + (1 — £*>i+1,3) £*4+1,2] Xi+1 + (1 — £*4+1,3) (1 — £*4+2,2) Xi+2- 
And the ith B-spline basis function of order four for knot sequence t is 
BiA = £*44-Bj,3 + (1 — £*4+l,4)-S»+l,3 

= 044043042X1 + [£*44 (<*43 (1 — £*4+1,2) + (1 _ £*4+1,3) 04+1,2) + (1 — 04+1,4) 04+1,304+1,2] Xi+1 
+ [<*44 (1 — £*4+1,3) (1 — £*4+2,2) + (1 — £*4+1,4) (04+1,3 (1 — 04+2,2) + (1 — t*4+ 2,3) 04+2,2)] Xi+2 
+ (1 — 04+1,4) (1 — 04+2,3) (1 — 04+3,2) Xi+3 

Hence, on the interval [U,U+ 1 ), the B-spline JV a.iBiAt(x) is 

104044043042 + o,;_ i [04-1,4 (04-1,3 (1 — 042) + (1 — o42) + (1 — 04,4) 043042] 

+ 04-2 [£*4-2,4 (1 — £*>4-1,3) (1 — W42) + (1 — 04-1,4) (04_i,3 (1 — 04,2) + (1 — 04,3) 04,2)] 

+ 04—3 (1 — £*4-2, 4 ) (1 — 04- 1 , 3 ) (1 — £*> 42 ) |xi( x ) 

Given a finite knot sequence t = (fi,..., t n ), define 

tl—k = ' ‘ ‘ = ^0 = ^1 ? tn = ^n+1 = • • • = ^n+/c• 

For u = 0, 1, 2,3 and v = i — 3, i — 2, i — 1, i, let o2, denote the coefficient of 
associated with a v of the polynomial on the interval [t+tj+i). 
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y aiBi 4 = y y “j E °S ** x- 


where for j < 0, define a : j = 0 and a y u ) = 0. In other words, 


i j=i —3 
(i) 


a=o 


3 

y, a[fx l = U)i4U>i3U)i2 
1=0 
3 

y o \ l }- iX l = UJi - 1,4 ((* 4 - 1,3 (1 - 04 2 ) + (1 - 043 ) 04 2 ) + (1 - u > i , 4 ) W i 3 Wj 2 
1=0 
3 

'y a [ l }-2 xl = w i-2,4 (1 — 04_+ 3 ) (1 — 04 >2 ) + (1 — Wi_i,4) (<*4-1,3 (1 — ,2) + (1 — 04 >3 ) Wj, 2 ) 

1=0 
3 

y a \* l _ 3 x l = (1 - 04-2,4) (1 - 04—1,3) (1 - 042). 

1=0 


Let l/(tj — ti) = 0 for tj 
below: 


ti■ 


Then ai',l 


can be represented in terms of t as 


bi~ 3 = 


(ti+l — ti-2){ti+l — ti-l)(t i+ l — ti) 

(i) i ft) oj- l „(i) j „(*) (i) .3 , 

°3,i-3 — —bi-3i a 2,i- 3 — 3ti+i6i_ 3 , a l,i-3 — ^i+1 a 2,i—3’ a 0,i-3 ~ ^i+l^i-3 


&i-2 


i 


(ti+2 — ti-l)(ti+i — ti-l)(ti+i ^ ti) 


, fri-l 


! 


(ti +2 ^ L:-l)(L+2 ^ ti)(t*+i — t,;) 


Q-3 ^_2 = bi— 3 + b-i —2 + bi— i 
_ 


l j_ 2 — —{ti- 2 + 2fi + i)6,:- 3 — (tj-1 + ti+l + ti+ 2 )&i_ 2 — {ti + 2t,;+ 2 )6i_i 


<4*1-2 — (2tj_ 2 tj+i + ti +1 )6i_ 3 + (tj-itj+i + tj-itj+2 + ti+iti +2 )6i_ 2 + (2tjtj+ 2 + tf +2 )bi -1 
®0i-2 = ti- 2 t i+1 bi- 3 ti — ifi + iti+2t>i_ 2 tit i+2 bi-i 


*0,i-2 

&i_4 = 


3 m+m+2 C 
1 

(ti+3 — ti)(ti +2 — ti)(ti+l — ti) 


1, i+2 u <—1 


a 3 j_i = —£>i —4 - bi-2 - h -1 

<4,1-1 = (ti+ 3 + 2ti)&i_4 + (tj+i + 2ti_i)6i_ 2 + (ti+ 2 + ti + ti—i)fei_ 1 


i_i — —(ti + 2ti+ 3 ti)foi_4 — (2ii+iti_i + t,;_i)&i_ 2 — [ti+ 2 (ti_i + ti) + ti_iti]6i_i 


<4*1-1 — ti +3 t?bi-4 + ti+itf_4&i— 2 + ti+ 2 titi-i&i-a 

(i) i ( i ) oj- l (*) j (») (i) ,3, 

a 3,i = Oi-4, a 2 ,i =-3ti0i_4, a)) =-tiO^t, a 0 j =-t, ; 6,-4 

Denote 

^i = ti+l ti* 
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For equally spaced knot sequence, i.e., A = A* = Aj +1 = the above 


expression for a 

can be simplified: 





(i) 

a 3 ,i -3 = 

1 

ft) 

a 2 ,z -3 = 

ti + A 

(i) 

a i j -3 = 

(ti + A) 2 

(i) 

a oj -3 = 

(ti + A) 3 

6A 3 

2A 3 

2A 3 

6A 3 

a « - 
u 3,i-2 ~ 

1 

(t) 

a 2 ,i -2 = 

3 ti + 2A 

(t) 

a l,i -2 = 

3 tf + 4t,;A 

(t) 

a 0 ,i -2 = 

—3t 3 - 6f 2 A + 4A 3 

2A 3 

2A 3 

2A 3 

6A 3 

(») 

a 3 ,i-l = 

1 

(t) 

° 2 ,i-l = 

3^ + A 

(i) 

= 

—3t 2 - 2UA + A 2 

(i) 

%'i-l = 

3 t 3 + 3f 2 A - 3f*A 2 + A 3 

2A 3 

2A 3 

2A 3 

6A 3 

(i) 

a 3,i = 

1 

6A 3 

(i) 

a 2,i = 

U 

2A 3 

(t) 

a l,i = 

t 2 

L i 

2A 3 

(t) 

a 0 ,i = 

t 3 

u i 

6A 3 


By ([3]), the B-spline ]T\ a,B jA is nonnegative on the interval [ti,ti+ 1 ) iff there 

exist ciV, < 4*2 i ) < 4*2 , such that 


V a (i) a- -c W -d W 

j=i—3 

a 2 l j a J = -ti c n + 2c-[l + tj+idu - 2(4*2 , 

i=*-3 

a lJ a i = _2 ^ c 12 + C 22 + 2i i+1^12 - d 22 , 

j=»—3 


E „(*)«, — _+-(») 1 f A*) 
a Oj — H l 22 “T °t+l u 22 

0—i -3 


M « j(<) j(») > Q 

*"1 1 1 Boo , U-1 1 , U.99 VJ 


(4*2) 


11 5 °22 5 “'ll 5 ^22 — 

2 / n / • \ / / . N \ 2 


— C 11 c 22 7 


(<44)' 


<4i4i 


(4) 


The model. Adding constraints Q to the P-spline regression loss function ([ 2 ]), 
we obtain the formula for fitting the metamodel: 
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m 


n 


2 


n 



i 



i 



j=i-3 


°li a 0 = ~ 2tiC 12 + C 22 + 2 U+ldi2 ~ 4: 2 1 


l 


(5) 


1 


E „(*)„, — _+«(*) I + ,/(*) 

u 0j U J — I/ I c 22 I °l+l u 22 



(* = 1, •••in¬ 

variable reduction. Let a denote the column vector containing all ap. a = 


(ati, ct 2 , ■ ■ ■, a n ) T . Denote 



Let c and d denote the column vectors containing all c)J ’s and a) - ’s: 

C = (ci, C 2 , ■ - ■ 7 C-n ) 7 d = (di, G?2, ■ ■ ■ 7 d n ). 


Constraints 0 contain 4 n homongenous equations and 7 n variables: a, c, 
and d. By Curry-Schoenberg theorem mm , the sequence Bi^ t , B n ^ t 
is a basis for the linear space of piecewise polynomials of order k with break 
sequence t that satisfies continuity condition specified by the multiplicities of 
the elements of t. Since the sub-matrix corresponding to a in the coefficient 
matrix of the constraints Q is the linear transformation from the B-spline basis 
to the truncated power function basis, the matrix corresponding to a has full 
column rank. 

Lemma 3.1. The coefficient matrix of the equalities in constraints 0 has rank 
at least 4 n — k + g, where k is the number of total multiple knots - counted with 
multiplicities, and g is the number of different multiple knots - counted without 
multiplicities. 
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Proof. The submatrix of the coefhcent matrix for the equalities in constraints Q 
with columns corresponding to c and d is block diagonal, where the *th block 
is: 

-1 1 

r = U -2 -t i+ 1 2 

i_ -1 - 2 t j+1 1 ■ 


ti 


—ti+i 


The block has rank 4 if ti ^ ti+i, and it has rank 3 if ti = U+i. Since the 
coefficient matrix of the equalities in constraints Q has 4n rows, the statement 
of the lemma follows. □ 


Corollary 3.2. If all the knots are distinct, then the coefficient matrix 

of the constraints 0 has full row rank. 

Therefore, given a distinct knot sequence t, we can use Gauss elimination to 
represent a by c, d and t in the constraints 0 . Since each a t relates to only ti , 
ti+ 1 , U+ 2 ) ti +3 hr constraints 0 , we can represent each cq by at most variables 
r (9 AO AO AO AO AO 

For equally spaced knot sequences, below are representations of 3 , a,_ 2 , a»_i, a^: 

For 4 < i < n, omitting the subscript of ti for simplicity, we have 


aJA 3 = 


„ t 2 22 t „ 
2 - + 3A +6 


QJi-i/A 3 _ — + — 


A 2 
t 2 
A 2 
t 2 
A 2 
2t 
3A 


lOt 

3A 

4t 

3A 


AO 

-11 






V3A 2 3 A J 


- ) C 12 + 


V 3A 3 3A 2 J 


u 22 


*11 


/ _ 2 X « _ / 5 \ , 


V3A 2 3A 2 A 


l 12 


V3A 3 3A 2 / 


,(0 

-11 


+ ( XT’ + X ) d\) + 


/ 2t 4 
V3A 2 + 3A 
8 1 


(i) 

C 12 


l '22 


*11 


( At 2 

V3A 3 


3A 2 


OLi- 2/A 3 = 


+ 


= _?L r <$ + „C 9 4- f JL 


3A Cn 


A , A _ 1 1 j. 

A 2 + 3A 3 1 11 


\3A 2 ' 3A J 

■(* 


'12 


A 
2t 
V3A 3 


+ JA ,« 

V3A 3 3 A 2 y 

2 

3 a 2 y 


QA2 + A fl 12 QA3 + QA2 °22 


\3A 3 

2 y (*) 

—txt C>v> 


/ 4t 2 14t 

V3A 3 + 3A 3 


3A 2 y 
2 
A 


22 


«12 -t- 


2 1 

3A 3 


1 

A 2 


e 

A 2 


At 

3A 


CXi-3/A 3 - ( -Tj + XT- ) C {1 + ( -TT4 + XT- 


A9 

-12 


+ 


2 a_ a 

A 2 3A 


+ - U 


X 11 


lot 4 
3A 2 + 3A 
/ 4t 2 20t 2 

\3A3 + 3A 2 “ A 


( 2t _ L_\ c b) 

13A 3 3A 2 y ° 


22 


l 22 


d: 


j(i) 

l 22 


l 12 


_A + d (i) 

3^3 ' Q A 2 I U 22 


3A 2 y 

( 6 ) 
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For i = 1: 


ol\ = 6A 3 


.(i) 

'ii 


7« 

ni 


3*i 


3if 


*? 


,(i) 

'ii 


/i) 

*n 


= *iC^ - 2cy2 - *2^u + 2 c ^12 ) ; 


>) - d (1) 

C 11 a ll 

>) - d (1) 
C 11 U 11 


= — 2tic^2 + C& + 2(*i + A)4 2 ; — d! 


7(1) 


(i) 

'22 i 


= ilC^ - (*1 + A)d& 


( 1 ) 


For » = 2: 

a, = 4f 2 A 2 cg ) + tA 2 ^ + (2A 3 - 4* 2 A 2 ) d™ - 4A 2 d< 2 2 > 
a 2 = 6A 2 (2t 2 + A) c<? + 12A 2 4 2 - 12A 2 * 2 df 1 ) - 12A 2 d^ 2) 


6A 2 


Ct2 


A - 2* 2 A - 3*1) c&> - 2* 2C ) 2J + + (3*1 + 2* 2 A - A 2 ) d 


,(2) L „(2) 


A - 2*2 L 
+ 2 (*2 + A) c*) 2 ; — d 22 
6A 3 




0(2 = 


^2 — ^A 2 + A 3 /3 - 


(*! + *1 - *2 A 2 + A 3 /3) c^ 2) - * 2 4 2 


„( 2 ) 


j(2)' 

*22 


- (*1 + *1 - * 2 A 2 + A 3 /2) 4i } + (*2 + A)dl 
For i = 3: 

ai = A (*1 - 2* 3 A) + A (2*3 - 2A) + Ac^ - A (A 2 - 4* 3 A + 1\) d 


(3) 

11 


+ A (4A - 2* 3 ) d^ - A d!& 
A 2 


(3) 


ai = 


A 2 - * 3 A + *§A - *§ L 


*1 - 2*1 A + 2*1 A - 


2< 3 A 2 


)■«-( 


- ( A 2 - 2* 3 A 


+ 2*lA))4 3 2 } - * 3 4 3) - (* 3 + *1A 2 - jj* 3 A 2 - 2* 3 A + 2*1 A + ^-)d! 


(3) 

11 


^A 2 + 2*1 A - 2* 3 A 


4^ + (^3 + A) c? 22 ^ 


(3) 

12 


a 2 = 2*1 Aqi + 4* 3 Ac 3 2 + 2A4 3) + (-2*lA + 4* 3 A 2 ) d + (4A 2 - 4* 3 A) dfj - 2A d\ 
a 3 = (3*lA + 6* 3 A 2 ) + (6* 3 A + 6A 2 ) + 3A4? + (3A 3 - 3*1 A) d£} } - 6* s Ad 

+ 6A 3 cf 1 ) - 6A 3 d£ } - 3A4 2 - 

Then we can replace in the objective of ([5]) by the following relation: 


j(3) 


22 
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ai = 6A 3 


d 1 ) _ 

■'ll 

6A 3 


i {1) 

l n 


Ot 2 — 


t 2 — ^ 2 A^ + A 3 /3 . 


(f 3 + t\ — t 2 A 2 + A 3 /3) c[ 2 ^ — t 2 C22 


„( 2 ) 


7(2) 


— (p2 + ~ ^2A" + A 3 /2) <3^2 1 + (^2 + A)g? 2 2 

a 3 = (34§A + 6t 3 A 2 ) + (6i 3 A + 6A 2 ) + SAc^ + (3A 3 - 3t§A) 4i } - 6t 3 Ad$ 


6A 3 c^ } - 6A 3 d[ 3 ? - 3A S$ 


i > 4 : 


2i 2 A 
f 2 A + 


22tA 2 

h ^“ 

lOfA 2 


+ 6A 3 


^8tA ; 22A 2 y (i) f (2t | 8A 


4i } + 


V 3 

4f 2 A 


3 

2tA 

~Y~ 


-12 


JO 


- 2A 2 ) d« - ( =: + )a: 

(7) 


2 t 5A 


j(0 


4 Second-Order Cone Programming 

Index vectors in M n from 0. A second-order cone (quadratic cone , Lorentz cone , 
or ice-cream cone ) in M" is the set 

Q™ = {a; = (x 0 ;x) elx R" _1 : x 0 > ||x||} . 

The rotated quadratic cone is obtained by rotating the second-order cone by 45 
degrees in the xq-x± plane: 

Q n = {x = (x 0 ; ii;i)elxRx R n ~ 2 : 2x 0 Xi > ||x|| 2 , x 0 > 0, aq > 0} . 

The nonnegative orthant is a one-dimensional second-order cone. Because a 
second-order cone induces a partial ordering, an n-dimensional vector x £ Q n 
can be represented as X As>„ 0. The subscript n is sometimes omitted when it 
is clear from the context. 

Second-order cone programming is an extension of linear programming. In 
second-order cone programming, one minimizes a linear objective function un¬ 
der linear equality constraints and second-order cone constraints where variables 
are required to be in the second-order cones. Let x% (i = 1,..., r) be vectors not 
necessarily of the same dimensions. Let c,; (z = 1,... ,r) and b be vectors and 
Ai (i = 1,..., r) be matrices. The standard form second-order cone program¬ 
ming problem is 

min z Yh=i c J x i 
subject to 1 AiXi = b 
Xi hQ 0 

Second-order cone programming has many applications. A solution to a 
second-order cone programming problem can be obtained approximately by in¬ 
terior point methods in polynomial time of the problem data size. See [2] for 
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a survey of applications and algorithms of second-order cone programming. In 
addition, the complexity of an interior point method for second-order cone pro¬ 
gramming doesn’t depend on the dimension of the second-order cone. 

The metamodel fitting problem ff can be casted as a second-order cone 
programming problem. Below are two constructions. 

Model I 


min 2: 

a,z 

i 

subject to ^ a 3 j a j = c n ~ dn ) 
j=i- 3 

Y. a^jCtj = -tjCii + 2<4*2 + U+idii - 2d$, 
j—i—3 


z,y 1 


V n {i) n - -2t r {i) + r (i) + 2f 


l=i-3 



^ ^ a Oj a j ~ ti c 22 + ^1+1^22 
j-i-3 


€ Q 2 > (dn 7 <4*2 , \/2di*2 ^ € Q 2 7 


(* = 1,... ,n) 


n n 

Y ajB jA {x 1 ), ...,y m ~Y a o B A ), VA (a 3 - 2 o 2 + Q!i), • • •, VA (a n - 2a n _i + a„_ 2 ) 
i=i i=i 


Model II. The square of the L 2 norm of a vector a: £ R ra no more than the 
value of y £ K. can be represented as a second-order cone constraint: 

IMI2 < y (y +!; 2/ - 1 ; 2®) £ Q n+2 . 

Therefore, problem ([ 5 ]) can also be formulated as the following second-order 
cone programming model: 


T 

Qm+ 
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mm 

Ot.lL.V 


Y u ' + X Y‘ 


j =3 


subject to ^ a 3 j a j = 4i* — 44 
j-i-3 

Y a 2j a i = -t * C ll + 2c 12 + *i+l4l ^ 2d«, 

j=i-3 


Y 4j “j = ~ 2 *i C 12 + C 22 + 2^+1^12 - 44 


j = i-3 


^ ' a 0j a j — ti c 22 "b U+1^22 

j-i-3 

(c^,4lV2c^y gq 2 , ( 4444724 ^ e (* 


„(i) 


j(») 


= 1, 


+ 1) Up 1)2 I U— ^ ( ajBj±{x p ) 
i=i 


€ 2 3 , 0= l,...,m) 


K + 1, V g - 1, 2 (a g - 2a g _! + a g _ 2 )] T £ 23 , (q = 3,...,n). 


5 Numerical Examples 

We have implemented the nonnegative P-spline regression by second-order cone 
programming in MATLAB / GNU Octave IT5I . 


5.1 Parameter Selection 

We use equally spaced knots. For each fixed knot sequence, the smoothness 
parameter A is chosen to minimize the GCV (Generalized Cross-Validation) 
statistic [82] : Let d(A) be the optimal coefficients under parameter A. The 
average squared residuals using A for the regression is 


m 

ASR(X) = nr 1 

2 — 1 


Vi 


j =1 
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Denote fl e I” x (" 2 ) as 


1 

-2 

1 


D = 


1 

-2 1 

1 -2 1 
1 -2 
1 


Then the penalty term in the regression loss function can be represented as 
x T DD t x. Let X £ R mxn {t eno te the design matrix whose ith row is 

Xi — [-Sl,4,t i^Xi) R2,4,t(%i) ^n,4,t(*^i)] ■ 

The smoother matrix 5(A) is defined as 

5(A) = X{X T X + A DD t )~ 1 X t . 


Then the generalized cross validation statistic is 


GCV{ A) 


ASR( A) 

[1 — m _1 tr {5(A)}] 2 


The value fr{5(A)} measures the effective degrees of freedom of the fit. 

The number of knots is determined by the Akaike information theoretical 
criterion (AIC) [J, i.e, we run the algorithm with different number of knots and 
choose the number n for the model that has the minimum AIC: 


AIC = m < In 


^ - V' a o B 3*{ x i) I / m 


i—l 


3=1 


+ 1 


2 n. 


5.2 Numerical Examples 

The second-order cone programming model is solved via SDPT3-4.0 ]5^j through 
the YALMIP interface [23]. YALMIP is a modeling language that models the 
problems into standard second-order cone programs. SDPT3 and SeDuMi are 
state of the art software for second-order cone programming. The reformatted 
SDPT3 and SeDuMi for GNU Octave by Michael Grant are available at the 
repositories on GitHub. 

Below are some numerical examples with MATLAB. Data points are de¬ 
picted by blue the B-splinc fitted function is the green curve. We tested 
the method with number of internal knots from 4 to 19. For each knot sequence, 
the A is chosen to minimize GCV. The values of A tested are l.Oe— 4,1.0e — 
3,..., 10 3 ,10 4 . The number of knots is determined by minimizing AIC. 
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Density estimation. Figure |T] shows the output of our method for density 
estimation. 

For the Poisson probability density function with mean 20: 


m 


20 fc e -20 

k\ 


based on GCV and AIC , the best model has 5 interior knots and A = 1.0 4 . 
For the Gamma probability density function with a = 2, /3 = 2: 


f{x;a,p) = 


/3 Q F(a 


- x a - x e~ 


F(a) = 


„a— l „—x 


dx. 


the best model has 19 interior knots and A = 10 2 . 

For the Weibull probability density function with a = 1, /3 = 1.5: 


f(x;a,/3) 


a „q—1 r -(x/&')°‘ 

t: 

0 


x > 0 
x < 0, 


the best model has 19 interior knots and A = 10 3 . 

For the Pareto probability density function with a = 1, b = 1: 


f(x;a,b) 


( ab a )/(x a+1 ), x > b 
0 x < b, 


the best model has 17 interior knots and A = 10 4 . 
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Figure 1: Probability Density Estimation 


Duration analysis. Duration analysis [21] studies the spell of an event. Let 
f(t) be the density function of the probability distribution of duration. The 
survivor function S(t) = Pr(x > t) is the probability that the random variable 
x will equal or exceed t. The hazard function A (t) = f(t)/S(t ) is the rate at 
which spells will be completed at t. 
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The data in Figure [2] are strike spells and survivor and hazard estimates 
from m- The duration time are strike lengths in days between 1968 and 1976 
involving at least 1,000 works in the U.S. manufacturing industries with major 
issue. 

For the survivor function estimation, the best model has 19 interior knots 
and A = 10 -3 ; For the hazard function estimation, the best model has 6 interior 
knots and A = 10 -2 ; 




Figure 2: Duration Data 


Cost and production. The data in Figure[3]are the monthly production costs 
and output for a hosiery mill over a 4-year period from m- The production 
is in thousands of dozens of pairs, and the costs is in thousands of dollars. We 
downloaded the data from Larry Winner’s web site: http://www.stat.ufl. 
edu/~winner/data/millcost. dat 

For the production data, the best model has 18 interior knots, and A = 10” 2 . 
For the cost data, the best model also has 18 interior knots, and A = 10 -2 . 


17 






Figure 3: Mill Production/Costs 
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